Wan 2.2 AI Video Generator
Wan AI is an advanced visual generation model developed by Tongyi Lab. It generates videos from text, images, and other control signals. Following Wan 2.1, the Wan 2.2 series models are now fully open source.
Wan AI Video Generators
Wan 2.1
Open Source
Advanced open-source video generation model with exceptional quality and versatility. Perfect for professional content creation.
Text to Video Example
See how Wan 2.1 transforms text into stunning videos
Real-time
A couple in formal evening attire is caught in heavy rain on their way home, holding a black umbrella. In the flat shot, the man is wearing a black suit and the woman is wearing a white long dress. They walk slowly in the rain, and the rain drips down the umbrella. The camera moves smoothly with their steps, showing their elegant posture in the rain.
Key Features
- ✓ High-quality video generation
- ✓ Text-to-video & Image-to-video
- ✓ Open source availability
Wan 2.2 Fun Control
Enhanced control and creative freedom with the latest Wan AI technology. Experience unprecedented precision in video generation.
Generation Example
Advanced motion control and style transfer
Real-time
Reference Character (Input)
Reference Motion (Input)
Generated Result (Output)
Combining character style with reference motion to create personalized video content.
Advanced Features
- ✓ Advanced control features
- ✓ Improved video quality
- ✓ Enhanced creative options
Wan 2.2
Experience the next generation of AI video generation with enhanced quality, precise control, and creative possibilities.
Advanced Control
Precise control over video generation with enhanced creative options
High Performance
Optimized processing for faster and more efficient video generation
Quality Output
Superior video quality with enhanced detail and consistency
Versatile Input
Support for multiple input types and creative workflows
Featured Examples
Style Transfer Example
Motion Generation
Creative Effects
Wan Video LoRA
Specialized video adaptation using LoRA technology. Create unique and personalized video styles with minimal training.
Specialized Features
- ✓ Custom style adaptation
- ✓ Fast fine-tuning capabilities
- ✓ Efficient resource usage
- ✓ Advanced style transfer
Wan AI Image Generators
Qwen Text-to-Image
AI-Powered Image Generation
Natural Language Understanding
Generate images from natural descriptions in Chinese or English, supporting classical poetry to modern expressions
High-Definition Output
Ultra-detailed rendering with exceptional clarity, perfect for professional content creation
Style Control
Precise style control with simple keywords, from anime to photorealistic rendering
Example Output
Generated from natural language description
Qwen Image Edit
Precise Image Editing & Enhancement
Key Features
Smart Text Editing
Intelligent font matching and style preservation for text modifications
Object Replacement
Seamless object swapping with automatic lighting and reflection adjustment
Effect Generation
Add professional visual effects with simple brush strokes
Draw to Image Workflow
Select Area
Circle or mark region
Draw Input
Sketch your changes
Describe
Add text instructions
Overview of Wan AI
SOTA Performance
Wan 2.2 consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks.
Supports Consumer-grade GPUs
The T2V-1.3B model requires only 8.19 GB VRAM, making it compatible with almost all consumer-grade GPUs. It can generate a 5-second 480P video on an RTX 4090 in about 4 minutes (without optimization techniques like quantization). Its performance is even comparable to some closed-source models.
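The figures above can be turned into a quick back-of-the-envelope check. The following is a minimal sketch, not part of Wan AI's tooling: the function names, the optional headroom parameter, and the example card sizes are hypothetical, and it assumes the GPU's memory and the quoted 8.19 GB are expressed in the same unit.

```python
# Figures quoted in the text above for the T2V-1.3B model.
REQUIRED_VRAM_GB = 8.19   # VRAM reported as required by T2V-1.3B
CLIP_SECONDS = 5          # length of the generated 480P clip
GENERATION_MINUTES = 4    # approximate generation time on an RTX 4090


def fits_in_vram(gpu_vram_gb: float, headroom_gb: float = 0.0) -> bool:
    """Return True if the GPU meets the quoted VRAM requirement.

    headroom_gb is a hypothetical safety margin for memory used by the
    display and other processes; the source does not specify one.
    """
    return gpu_vram_gb - headroom_gb >= REQUIRED_VRAM_GB


def compute_seconds_per_video_second() -> float:
    """Wall-clock seconds spent per second of generated video."""
    return GENERATION_MINUTES * 60 / CLIP_SECONDS


if __name__ == "__main__":
    # Hypothetical card sizes, used only for illustration.
    for name, vram_gb in [("6 GB card", 6.0), ("12 GB card", 12.0), ("24 GB card", 24.0)]:
        print(f"{name}: fits = {fits_in_vram(vram_gb)}")
    ratio = compute_seconds_per_video_second()
    print(f"{ratio:.0f} s of compute per second of video")
```

At the quoted numbers, generation takes roughly 48 seconds of compute for each second of output video on an RTX 4090, so this configuration suits offline rendering rather than interactive use.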
Multiple tasks
Wan 2.2 excels in Text-to-Video, Image-to-Video, Video Editing, Text-to-Image, and Video-to-Audio, advancing the field of video generation.
Visual Text Generation
Wan 2.2 is the first video model capable of generating both Chinese and English text, featuring robust text generation that enhances its practical applications.
Powerful Video VAE of Wan AI
Wan-VAE delivers exceptional efficiency and performance, encoding and decoding 1080P videos of any length while preserving temporal information, making it an ideal foundation for video and image generation.
Features of Wan AI
Complex Motions by Wan AI 2.2
Excels at generating realistic videos featuring extensive body movements, complex rotations, dynamic scene transitions, and fluid camera motions.
Physical Simulation by Wan AI 2.2
Generates videos that accurately simulate real-world physics and realistic object interactions.
Cinematic Quality by Wan AI 2.2
Offers movie-like visuals with rich textures and a variety of stylized effects.
Controllable Editing by Wan AI 2.2
Features a universal editing model for precise edits using image or video references.
Visual Text Generation by Wan AI 2.2
Creates text and dynamic text effects in videos directly from text prompts.
Product Features
Our product gives you a user-friendly way to use our models and create inspiring video content.
Text to Video
Image to Video
Start and End Frames
Wan AI 2.2 Open Source
In this repo, we release the code and weights for Wan 2.2, a comprehensive and open suite of video foundation models designed to push the boundaries of video generation.
The I2V-14B model outperforms leading closed-source models as well as all existing open-source models, achieving SOTA performance. It generates videos with complex visual scenes and motion patterns from input text and images, and it is available in both 480P and 720P variants.
Wan2.2-T2V
480-720P
The T2V-14B model sets a new state of the art among both open-source and closed-source models, generating high-quality visuals with substantial motion dynamics. It is also the only video model capable of producing both Chinese and English text, and it supports video generation at both 480P and 720P resolutions.
Wan2.2-T2V-1.3B
480P
The T2V-1.3B model runs on almost all consumer-grade GPUs, requiring only 8.19 GB of VRAM to produce a 5-second 480P video in about 4 minutes on an RTX 4090 GPU. Through pre-training and distillation, it surpasses larger open-source models and achieves performance comparable to some advanced closed-source models.
Wan2.2-FLF2V-14B-720P
Wan 2.1 First-Last-Frame-to-Video (FLF2V) is an AI-based video generation technology that synthesizes intermediate frames between a given start and end frame to produce smooth videos. It leverages a 14B-parameter model, supports multi-GPU accelerated inference, and offers pretrained checkpoints with a Gradio demo for interactive testing. Applications include video inpainting, animation production, and more.
Alibaba Wan2.2 – Now Available!
Next-Gen Upgrade, Beyond Limits
The all-new Wan2.2 is here, delivering enhanced performance, higher efficiency, and smarter capabilities!
Blazing-Fast Computing with Wan2.2
Experience peak performance with Wan2.2's optimized architecture
Ultra-Low Latency
Achieve unmatched network transmission efficiency with Wan2.2
Broad Compatibility
Wan2.2 seamlessly supports diverse business scenarios
AI-Powered Optimization
Enjoy intelligent auto-tuning with Wan2.2
Explore Wan2.2 Today!
Discover the latest Wan2.2 features and capabilities!
Frequently Asked Questions
What is Wan2.2 by Wan AI and how does it work?
Wan2.2 by Wan AI is Alibaba Cloud's state-of-the-art video generation model that transforms text descriptions into stunning, high-quality videos. Leveraging advanced technologies like Variational Autoencoders (VAE) and Diffusion Transformers (DiT), it ensures realistic visuals, smooth transitions, and accurate physics for a truly immersive experience.
Do I need technical expertise to use Wan 2.2 by Wan AI?
Wan 2.2 by Wan AI is designed with simplicity in mind. Its intuitive interface allows anyone to create professional-quality videos effortlessly, even without advanced technical skills. Whether you're a beginner or a pro, you'll find the platform easy to navigate and use.
What types of videos can I create with Wan 2.2 by Wan AI?
Wan 2.2 by Wan AI is versatile and capable of generating a wide range of video content. From dynamic scenes like dancing and sports to educational tutorials and historical video restoration, it empowers you to bring your creative vision to life.
How long does it take to generate a video?
The video generation time depends on the complexity and length of your project. For faster results, the Pro version offers accelerated processing speeds, making it ideal for time-sensitive tasks.
Can I customize the video output?
Absolutely! Wan 2.2 by Wan AI provides extensive customization options, allowing you to adjust resolution, frame rate, movement complexity, and more. Tailor your videos to meet your specific needs and preferences.
What input formats does Wan 2.2 AI by Wan AI support for video generation?
Wan 2.2 AI by Wan AI primarily supports text descriptions as input for video generation. You can provide detailed textual prompts describing the scene, actions, and desired visual effects. Additionally, it may support image inputs for enhanced context in future updates.
Can Wan 2.2 AI by Wan AI generate videos in multiple languages?
Yes, Wan 2.2 AI by Wan AI supports multilingual text inputs, allowing you to generate videos based on descriptions in various languages. However, the quality of output may vary depending on the language and the complexity of the description.
Is there a limit to the length of videos that Wan 2.2 by Wan AI can generate?
The length of generated videos depends on the subscription plan. The free version may have limitations on video duration, while the Pro version supports longer and more complex video generation. Specific limits can be found in the platform's documentation.
How does Wan 2.2 by Wan AI ensure the quality of generated videos?
Wan 2.2 by Wan AI leverages advanced technologies like Variational Autoencoders (VAE) and Diffusion Transformers (DiT) to ensure high-quality outputs. These technologies enable realistic visuals, smooth transitions, and accurate physics simulations.
How does Wan 2.2 by Wan AI handle complex scenes with multiple characters?
Wan 2.2 by Wan AI is designed to handle complex scenes with multiple characters by analyzing the relationships and interactions described in the text input. It uses advanced algorithms to ensure realistic positioning, movements, and interactions between characters.